Minimum description length (MDL) regularization for online learning
Abstract
An approach inspired by the Minimum Description Length (MDL) principle is proposed for adaptively selecting features during online learning based on their usefulness in improving the objective. The approach eliminates noisy or useless features from the optimization process, leading to improved loss. Several algorithmic variations of the approach are presented, all based on placing a Bayesian mixture in each dimension of the feature space. By applying the MDL principle, the mixture reduces the feature space to the subspace with the lowest loss. Derived loss bounds show that the loss of that subspace is essentially achieved. The approach can be tuned to trade off model size against the loss incurred. Empirical results on large-scale real-world systems demonstrate how it improves such tradeoffs: very large reductions in model size can be achieved with no loss in performance relative to standard techniques, while moderate loss improvements (translating to large regret improvements) are achieved with moderate size reductions. The results also demonstrate that the approach eliminates overfitting.
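A minimal sketch of the per-dimension Bayesian-mixture idea, not the paper's exact algorithm: each feature keeps a two-component mixture over "feature active" (its learned weight) and "feature pruned" (weight fixed at zero), with mixture posteriors updated by exponential weights from per-step squared losses. The problem setup, learning rates, and the exponential-weights update are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 20, 5000
eta, lr = 0.5, 0.05            # mixture learning rate, SGD step size

w = np.zeros(d)                # per-feature weights, learned online
log_active = np.zeros(d)       # unnormalized log posterior: feature kept
log_zero = np.zeros(d)         # unnormalized log posterior: feature pruned

w_true = np.zeros(d)
w_true[:5] = rng.normal(size=5)  # only 5 of the 20 features are informative

for t in range(T):
    x = rng.normal(size=d)
    y = x @ w_true + 0.1 * rng.normal()

    # Posterior probability that each feature is active (clipped for stability).
    p_active = 1.0 / (1.0 + np.exp(np.clip(log_zero - log_active, -50.0, 50.0)))
    w_eff = p_active * w       # posterior-averaged (soft-pruned) weights
    y_hat = x @ w_eff

    # Loss each per-dimension hypothesis would have incurred on this example.
    base = y_hat - w_eff * x   # prediction with feature i's contribution removed
    loss_active = (y - (base + w * x)) ** 2
    loss_zero = (y - base) ** 2

    # Exponential-weights (Bayesian mixture) update in each dimension.
    log_active -= eta * loss_active
    log_zero -= eta * loss_zero

    # SGD on the weights of the mixture prediction.
    w += lr * (y - y_hat) * p_active * x

kept = np.flatnonzero(p_active > 0.5)
print("features kept:", kept)  # typically concentrates on the informative subset
```

Noisy features gain no consistent loss advantage from being active, so their mixture mass drifts to the pruned component and they drop out of the model, which is the size/loss tradeoff the abstract describes.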
Similar papers
Minimum Description Length Principle
The minimum description length (MDL) principle states that one should prefer the model that yields the shortest description of the data when the complexity of the model itself is also accounted for. MDL provides a versatile approach to statistical modeling. It is applicable to model selection and regularization. Modern versions of MDL lead to robust methods that are well suited for choosing an ...
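For concreteness, the crude two-part form of this selection rule (a standard textbook formulation, not taken from the paper above) is

```latex
\hat{M} \;=\; \arg\min_{M \in \mathcal{M}} \bigl[\, L(M) + L(D \mid M) \,\bigr]
```

where $L(M)$ is the code length needed to describe the model and $L(D \mid M)$ is the code length of the data when encoded with the model's help.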
MDL Regularizer: A New Regularizer based on the MDL Principle
This paper proposes a new regularization method based on the MDL (Minimum Description Length) principle. A weight vector of adequate precision is trained by approximately truncating the maximum-likelihood weight vector. The main advantage of the proposed regularizer over existing ones is that it automatically determines a regularization factor without assuming any specific prior distribution with...
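A toy illustration of the precision-truncation idea, not the paper's regularizer: pick a per-weight precision by minimizing a two-part description length, bits for the quantized weights plus a Gaussian code length for the residual misfit. The quantization scheme and code-length expressions are my assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
w_ml = rng.normal(size=d)                  # stand-in maximum-likelihood weights
y = X @ w_ml + 0.05 * rng.normal(size=n)

best = None
for bits in range(1, 16):
    step = 2.0 ** (1 - bits)               # quantization step for this precision
    w_q = np.round(w_ml / step) * step     # truncate weights to `bits` of precision
    resid = y - X @ w_q
    model_len = d * bits                   # bits to describe the quantized weights
    # Gaussian code length of the residuals, in bits, up to additive constants.
    data_len = 0.5 * n * np.log2(np.mean(resid ** 2) + 1e-12)
    total = model_len + data_len
    if best is None or total < best[0]:
        best = (total, bits)
print("chosen precision (bits per weight):", best[1])
```

The minimum lands where extra weight precision stops buying a shorter data code, which plays the role of a regularization factor without any explicit prior.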
Bayesian Ying Yang Learning (II): A New Mechanism for Model Selection and Regularization
Efforts toward a key challenge of statistical learning, namely learning from a finite sample of data while retaining model selection ability, have been discussed in two typical streams. Bayesian Ying Yang (BYY) harmony learning provides a promising tool for solving this key challenge, with new mechanisms for model selection and regularization. Moreover, not only is BYY harmony learning further j...
Multi-regularization Parameters Estimation for Gaussian Mixture Classifier based on MDL Principle
Regularization is a solution to the problem of unstable estimation of the covariance matrix from a small sample set in a Gaussian classifier, and estimating multiple regularization parameters is more difficult than estimating a single one. In this paper, the KLIM_L covariance matrix estimator is derived theoretically from the MDL (minimum description length) principle for the small-sample problem...
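A generic illustration of the instability being regularized, assuming nothing about the paper's KLIM_L estimator: with fewer samples than dimensions the sample covariance is singular, and shrinkage toward a scaled identity (the usual single-parameter remedy) stabilizes it.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 10, 8                        # more dimensions than samples
X = rng.normal(size=(n, d))

S = np.cov(X, rowvar=False)         # sample covariance: singular when n <= d
target = np.trace(S) / d * np.eye(d)

for lam in (0.0, 0.1, 0.5):
    sigma = (1 - lam) * S + lam * target     # shrinkage-regularized estimate
    # The condition number shows how shrinkage makes inversion well-posed.
    print(f"lambda={lam:.1f}  cond={np.linalg.cond(sigma):.3g}")
```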
Learning Bayesian Belief Networks Based on the Minimum Description Length Principle: Basic Properties
This paper addresses the problem of learning Bayesian belief networks (BBN) based on the minimum description length (MDL) principle. First, we give a formula for the description length, based on which the MDL-based procedure learns a BBN. Secondly, we point out that the difference between the MDL-based and Cooper and Herskovits procedures lies essentially in the priors rather than in the approac...
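Such description-length formulas typically take the familiar two-part shape below (the standard MDL score for a network structure $B_S$ on $N$ samples; the paper's exact formula may differ):

```latex
\mathrm{DL}(B_S, D) \;=\; -\log \hat{P}(D \mid B_S, \hat{\theta}) \;+\; \frac{k(B_S)}{2}\,\log N
```

where $\hat{\theta}$ is the maximum-likelihood parameter vector and $k(B_S)$ the number of free parameters of the structure; the penalty term plays the role of a prior, which is where MDL-based and Cooper–Herskovits scores can differ.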